Technology described in this application relates to responding to queries for aggregated video and/or audio content that is found embedded in web pages. In particular, this technology relates to ranking of search results in response to text-based queries and compiling an index against which to search. Particular aspects of the technology are described in the drawings, specification and claims submitted herewith.
The amount of media available on the web is staggering and growing rapidly. Indexing content on the web is enormously challenging. Full courses are taught in major universities regarding search engines.
Retrieving video content presents new and different challenges for search engines. Some researchers have attempted to develop graphical search-by-example interfaces based on analysis of images and videos. While this may be promising for enforcement activities by the movie industry and TV studios or for face recognition in surveillance videos, it would not be practical to search a video hosting website such as YouTube using sketch-based queries. Most users expect to search for video using text searches.
Using text searches to find video content is much more challenging than using text searches to find documents. The content sought is of a different type than the metadata searched. Much video content, such as videos found on YouTube, is supplied by amateurs, without quality control of associated textual metadata.
While the corpus of videos and podcasts is relatively modest as a part of the whole Web content, it is expanding quickly. From 2004 through June 2008, the assignee of this application expanded its indexing from 50,000 videos and podcasts to over 10 million items. In 2009 and 2010, this grew to 30 million and 50 million items, respectively.
With the explosion of video content, a new category of enterprise has emerged, sometimes called video aggregators and other times called video search engines. The video aggregators identified in the media include MeFeedia, Blinkx, VideoSurf, Pixsy, Yidio, CastTV and Veveo. These companies index videos across video content hosts and sites that embed links to the videos.
Accordingly, an opportunity arises to provide improved search tools and indexes. In the sections that follow, we describe a ranking tool adapted specifically to video and/or audio content hosted on sites such as YouTube and NBC and distributed both through the hosted sites and through RSS feeds.